ABSTRACT

Writing researchers have developed various methods for investigating the writing process
since the 1970s. The early 1980s saw the emergence of the real-time computer-aided study
of the writing process, which relies on the protocols generated by recording computer
screen activities as writers compose using a word processor. This article reviews the literature
on that approach to studying the writing process. It begins by defining the real-time
computer-aided study of the writing process, tracing its historical development, and
explaining the advantages it offers. It then briefly describes the software that has
been used in computer-aided writing process research and discusses the ways of
analyzing the logged data. It ends with an overview of computer-aided writing process
research.


I. INTRODUCTION

Writing is a cognitively demanding process in which many strategies are used. The 1970s
and 1980s witnessed a shift in teaching writing from emphasis on the product of writing
activities to emphasis on the process of developing that written product. The emergence of
the process movement in both first language (L1) and second language/foreign language
(L2/FL) writing was influenced by some early research on L1 composing, specifically
Emig’s (1971) seminal study and Flower and Hayes’s (1981) cognitive model of writing.
These early studies have indicated that writing is best understood as ‘a set of hierarchical 
and recursive thinking processes’ guided by the growing network of goals generated or 
adapted by writers (Flower & Hayes, 1981: 366). The assumptions underlying writing 
process research are that examining students’ written products tells us very little about their 
instructional needs and that effective teaching of writing needs to be based on knowing how 
writers compose their texts. By investigating the writing process, researchers try to find out 
how writers develop their texts and what kind of strategies, i.e. planning, retrieving, 
reviewing, monitoring and revising, they employ while composing. Writing process research 
can inform us about the strategies used by good and poor writers and the different thinking
patterns involved in composing the text, and about the difficulties students may encounter 
while composing; thus we can adapt our teaching methods to meet students’ writing needs. 
Pedagogically speaking, assessing the writing process, which is a main component of
writing process instruction, is of utmost importance, as it can be used for raising students’
consciousness about good writing strategies and for training them in using these strategies.

Since Emig’s (1971) seminal work on the writing processes of her twelfth-grade
subjects, increasing attention has been given to investigating the way writers compose their
texts. The few writing process studies conducted in the 1970s paved the way for a growing
number of studies in the area from the early 1980s to the present. This growing number of
studies on students’ writing processes has
informed researchers (e.g. Bereiter & Scardamalia, 1987; Chenoweth & Hayes, 2001; Flower
& Hayes, 1981; Grabe & Kaplan, 1996; Kellogg, 1996) in building their own theories or 
models of the cognitive process of writing. The massive shift from the product
approach to the process approach in writing research was also accompanied by the incorporation of
new research methods in the writing area. Writing researchers have developed different
methods and techniques for collecting and analyzing the writing process data, including the 
think-aloud method, writers’ retrospective accounts stimulated by their texts or the video or 
audio recording of the writing session, questionnaires, process logs, text analysis, 
naturalistic observation, video-based observation, and the real-time computer-aided study of
the writing process, which is an observation-based method. This last method is discussed in
detail in this article. The article traces the historical development of the real-time
computer-aided study of the writing process and briefly describes some of the software used
for observing and analyzing the writing process. It then reviews the ways of analyzing the
logged data and the research employing the method to investigate the writing process.


II. REAL-TIME COMPUTER-AIDED STUDY OF THE WRITING PROCESS:
DEFINITION, HISTORICAL DEVELOPMENT, AND RATIONALE

The computer has increasingly been incorporated in writing research and teaching since its 
large-scale introduction into education in the late 1970s. The advent of computer-based 
writing has created new areas for investigating the writing process and has led researchers to 
develop new methods for observing the real-time writing process (Spelman Miller, 2000:
127). The real-time computer-aided approach to studying the writing process can be defined
as observing and analyzing the online writing process through recording computer screen 
activities, i.e. the keyboard presses and cursor movements, scrolling, the timing of each 
movement and pauses between these movements. In light of this definition, there are two 
dimensions of that approach to studying the writing process: observing or recording the 
writing process and analyzing the recorded data. The real-time computer-aided writing 
process can be observed via two methods: a) using a camera to record the computer screen
activities; and b) using a computerized programme to log keystrokes. At present,
the latter method is more commonly used by writing researchers than the former. That is
why terms such as ‘logged data’ and ‘keystroke logging’ will be used in this article to
refer to both methods. Some of the keystroke logging programmes work independently of
word processors while others function within the word processor used. Most keystroke 
recording programmes have a replay feature by which the text can be played back in real 
time. The data generated by either method, which may be called the real-time writing 
protocol, the computer-generated protocol or the logged data, is often saved as log-files for 
later processing. As for the computer-aided analysis of the writing process, this means 
analyzing the logged data using some software developed for that purpose. 
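
As a minimal sketch of what such logged data might look like, the Python fragment below models each logged action as a timestamped record and derives the pauses between successive actions. The record layout (`LogEvent` and its field names) and the two-second pause threshold are illustrative assumptions; they do not reproduce the log format of any programme reviewed in this article.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LogEvent:
    """One logged action (hypothetical layout, not a real tool's format)."""
    time_s: float   # seconds since the start of the writing session
    kind: str       # e.g. "key", "cursor", "scroll"
    value: str      # the character typed or the movement made


def pauses(events: List[LogEvent], threshold_s: float = 2.0) -> List[Tuple[int, float]]:
    """Return (index, duration) for every gap of at least threshold_s.

    The index is that of the event that ends the pause.
    """
    found = []
    for i in range(1, len(events)):
        gap = events[i].time_s - events[i - 1].time_s
        if gap >= threshold_s:
            found.append((i, round(gap, 3)))
    return found


session = [
    LogEvent(0.0, "key", "T"),
    LogEvent(0.2, "key", "h"),
    LogEvent(0.4, "key", "e"),
    LogEvent(3.1, "key", " "),       # preceded by a pause of about 2.7 s
    LogEvent(3.3, "cursor", "LEFT"),
    LogEvent(8.3, "key", "a"),       # preceded by a pause of about 5.0 s
]

print(pauses(session))
```

The threshold is a parameter because the studies reviewed below do not share a single definition of what counts as a pause.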

The real-time computer-aided approach to studying the writing process emerged in
the early 1980s. It can be argued that this approach was influenced by Matsuhashi’s two
early reported studies (Matsuhashi, 1981; Matsuhashi & Cooper, 1978) which derived their 
observational method from research on the temporal aspects of speech production. In these 
two studies, the video time-monitored observation was employed to investigate pausing in 
writing. The subjects transcribed their texts on a specially-sized pad placed on a desk. A 
video camera was used to focus on the pad and it sent signals to a special effects generator 
allowing these signals to be recorded concurrently and be replayed on a split screen, and to a 
data time generator that recorded the real-time of the writing session in minutes, seconds, 
and tenths of seconds. The data in the two studies were analyzed in terms of pause time, 
transcribing time, pause length, pause location and words per minute. Although the real-time
computer-aided method was introduced to the writing process area in the early 1980s, only a few
studies employing this method were conducted during that decade. Collier’s (1983) study
might be the earliest one to video-record the computer screen activities. Another early study 
that employed the computer screen video-recording was reported by Benesch (1987). As for 
using computerized programmes to record the computer screen activities, an early attempt
was made by Bridwell-Bowles, Sire and Brooke (1985), who examined students’ writing
using the Playback programme to create keystroke data. Other early studies using keystroke
logging include those reported by Balkema (1985), Bridwell-Bowles, Johnson and
Brehe (1987), Flinn (1987a; 1987b), Lutz (1987) and Sire (1988). It should be noted that all
of these studies that employed the keystroke logging method were conducted in the
L1 writing context. L2 or FL writing studies using keystroke logging seem to have appeared
only in the 1990s, a decade that saw an increasing use of real-time methods in writing
process research as desktop computers and keystroke tracking software became ubiquitous 
(Levy & Olive, 2001: 4). Since the early 1990s, new keystroke logging software has been 
introduced and some computer-aided analysis programmes have been developed as well. 

The real-time computer-aided observation of the writing process offers researchers
many advantages. The data recording process is unobtrusive, i.e. it does not interfere with
using the word processor (Leijten & Van Waes, 2005). Being an unobtrusive data collection
method, computer-aided observation overcomes the potential problems involved in using
other instruments for collecting data about the writing process, e.g. the think-aloud method,
retrospective interviews and questionnaires, such as reactivity, difficulty of retrieval and social
desirability. Replaying the keystroke-recorded writing sessions allows us to know not only 
the problems writers encounter in their writing but also their problems in using the computer 
(Lansman, Smith & Weber, 1993: 89). It provides us with a clear understanding of the 
dynamics of text production (Spelman Miller & Sullivan, 2006: 4) and with a distribution of
time and effort allocated to writing subprocesses (Levy & Ransdell, 1994: 219). One main 
advantage offered by the real-time computer-aided observation of the writing session is that 
the data recorded can be archived and studied by other researchers (Levy & Ransdell, 1996: 
160). The logged data is a rich source for studying the writing process, particularly those aspects
related to writing fluency and temporal aspects of writing such as pausing and the timing of
the writing activities. Using this kind of observational data to complement other data
sources enriches our understanding of the complex processing involved in writing
(Ransdell, 1995: 97). It also fits well within the case study and ethnographic approaches to
studying the writing process (Flinn, 1987a: 42). Protocols generated by logging software 
bridge the gap between the case study approach and the quantitative approach to 
investigating the writing process as they allow researchers to examine writers’ strategies in 
detail and to quantify them (Lansman et al., 1993: 89). Finally, real-time computer-aided 
observation can be used with writers of different ages and different language developmental 
levels. 

However, keystroke logged data is not without its criticisms. The main criticism of 
this type of data is the difficulty of interpreting it due to lack of information about writers’ 
internal cognitive processes. The other difficulties involved in assessing the writing process 
using the logged data and the ways of addressing these difficulties are highlighted in 
sections IV and VI, respectively. The next section briefly describes some of the software 
writing researchers have used to observe the writing process and to analyze it.


III. SOFTWARE USED FOR OBSERVING AND ANALYZING THE WRITING
PROCESS

Software employed in studying the writing process can be classified into two categories: 
software used for recording the computer screen activities (digital cameras or keystroke 
logging software) and software used for analyzing the computer-generated protocols. In the 
following paragraphs, some of the computerized programmes of both categories used in the 
writing process research are briefly described. 


III.1. Programmes Used for Recording the Computer Screen Activities

Early software developed for recording keystroke movements includes: a) COMPTRACE: a
modified version of the MILLIKEN Word Processor for the Apple IIe that records
keystrokes and replays the composing session (Flinn, 1987a, 1987b); b) Writing
Environment (WE): software that is implemented on UNIX workstations and has four
system modes (Network Mode, Tree Mode, Edit Mode and Text Mode) appearing in four
windows on the computer screen, in each of which writers can view their texts differently
and perform different operations as well; c) Keytrap: a resident programme that records all
keystrokes and the pauses between them and analyzes some aspects of the writing process 
(Janssen, Van Waes & Van den Bergh, 1996); and d) ScreenRecorder: a camera and macro 
that functions within MediaTracks, a presentation graphics programme, by capturing the 
computer screen activities invisibly (Radziemski, 1995). Win What Where Investigator and 
Camtasia Studio Software are two recent programmes that have been used in some studies 
(e.g. Figueredo, 2006; Youngquist, 2003), though not described in detail by the authors who 
reported using them. 

ScriptLog and Inputlog are two recently developed logging programmes. ScriptLog 
has three main modules: a design module in which the task requirements are identified, a 
recording module providing a binfile comprising the writing session activities and their
timing, and an analysis module allowing researchers to extract selected patterns from the
binfile (Strömqvist, Holmqvist, Johansson, Karlsson & Wengelin, 2006). Inputlog
was developed for Windows environments. In developing Inputlog, Leijten and Van Waes
(2005) drew on some of the characteristics of two earlier programmes (JEdit and Trace-it),
on the one hand, and ScriptLog on the other hand. The aim of designing Inputlog is to
analyze several aspects of the writing process quickly and accurately. The programme can 
perform the following functions: recording the writing session data in Microsoft Word, 
generating data files for analyses, playing the recorded session at different speeds,
and capturing the dictated input using speech recognition software. What distinguishes
Inputlog from Trace-it and ScriptLog is that it records keystroke and mouse movements
independently of the word processor used and that it can also record other actions such as using
online or programming reference features. In addition, Inputlog can analyze the writing
process aspects in different ways. Leijten and Van Waes (2005: 16) point out that some new
components are planned to be integrated in the next version of Inputlog; these are: a revision
analysis component, a progression analysis component, and a speech recognition or dictated 
text component which will allow researchers to explore its effect on the writing process and 
may also offer them the possibility of simultaneously transcribing thinking-aloud 
verbalizations and retrospective interviews. 

Writing researchers have reported using a few computerized programmes for
recording data from non-English word processors. Ta Kupu and Systeme-D are two of these
programmes. Ta Kupu is word processing software for the Maori language; it also
includes a data logging tool and an evaluation and analysis tool, Tirohia, which can be used for
replaying a record of the student’s interaction in real time and examining interaction log files
at a fine-grained level (Barbour, Cunningham & Ford, 1993). Systeme-D, on the other hand,
is a French word processing programme developed by Noblitt, Sola and Pet (1987). Using 
this programme, writers of French access some referencing features, including a bilingual 
dictionary (French-English), a verb conjugator, a reference grammar, a vocabulary index and 
a phrase index. Systeme-D has a query tracking device that generates a log of the
information accessed by writers while composing on the computer and gives a record of the
following data: a) the time of beginning and ending the writing task; b) the order of
information accessed in real time; c) dictionary searches in both French and English; and d)
other grammar and vocabulary inquiries (Scott & New, 1994; New, 1999).

Another category of the logging software is used for examining specific aspects of 
the writing process or the writing process in a specific mode. Two types of this category can 
be identified in the literature: reaction time software, and translation process software. Some 
writing researchers have used reaction time tasks or triple tasks to measure the amount of time
and effort allocated to the writing subprocesses. The triple task, a procedure proposed by
Kellogg (1986), incorporates three tasks: a writing task, a reaction time task (auditory signal 
detection) and a directed retrospection task. Using this technique, participants are asked to 
compose on the computer as a primary task and to respond as quickly as possible to an 
auditory probe (e.g. a beep or a tone) by pressing a labeled button on a response category
box or a key on the keyboard to identify the writing subprocesses (planning, translating,
reviewing, and other) in which they engage (Olive, Kellogg & Piolat, 2001). Participants are 
instructed to use the planning category when they create or organize ideas and set global or 
local goals, the translating category when they put ideas into words, the reviewing category 
when they read the text written, evaluate the text or plans and detect errors, and the other 
category when they engage in any other thoughts that do not fit within the three previous 
categories (Kellogg, 2001). Thus, the triple task technique makes use of immediate testing 
with verbal protocol and directed retrospection while composing and provides an estimate of 
the processing time devoted to each of the writing subcomponents (Olive et al., 2001). Some
computerized programmes have been developed to implement that technique, such as
PASCAL (Kellogg, 1987, 1988) and SCRIPTKELL, which runs on a Macintosh machine and
automatically records and analyzes the aspects constituting the dependent variables of 
Kellogg's procedure such as number of reactions, frequency of category choices, mean 
reaction times and mean difference scores (Piolat, Olive, Roussey, Thunin & Ziegler, 1999). 
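
The dependent variables listed above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the probe records, the function name `summarize` and the baseline value are hypothetical, and the difference score is assumed here to be the mean reaction time while writing minus a baseline reaction time measured without a concurrent writing task. It does not reproduce the output of SCRIPTKELL or of any other programme.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List, Tuple

CATEGORIES = ("planning", "translating", "reviewing", "other")

# Hypothetical probe records: (reaction time in ms, category chosen).
Probe = Tuple[float, str]


def summarize(probes: List[Probe], baseline_rt_ms: float) -> Dict[str, object]:
    """Summarize one participant's responses to the auditory probes."""
    counts = Counter(category for _, category in probes)
    # Mean reaction time for each category that was chosen at least once.
    mean_rt = {
        c: mean(rt for rt, cat in probes if cat == c)
        for c in CATEGORIES
        if counts[c] > 0
    }
    return {
        "n_reactions": len(probes),
        "category_frequency": {c: counts[c] for c in CATEGORIES},
        "mean_rt_ms": mean_rt,
        # Assumption: difference score = RT while writing minus baseline RT.
        "mean_difference_ms": {c: rt - baseline_rt_ms for c, rt in mean_rt.items()},
    }


probes = [
    (520.0, "planning"),
    (480.0, "translating"),
    (610.0, "planning"),
    (450.0, "translating"),
    (700.0, "reviewing"),
]
result = summarize(probes, baseline_rt_ms=300.0)
print(result["category_frequency"])
print(result["mean_difference_ms"])
```

Under that assumption, a larger difference score for a category would suggest that the corresponding subprocess leaves less attention free for detecting the probe.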

Translog is a programme developed for collecting real-time data about the cognitive
aspects of the translation process, i.e. translating from one language to another, that could 
supplement other data gathered by introspective and/or retrospective instruments. The main 
assumption behind developing Translog is that by examining the temporal patterns of typing 
and pausing during the translation process, we could reach a better understanding of the 
dynamic interaction of the processes involved in it. Translog can also be used for recording 
most writing tasks, including the post-editing of a machine translated text. The programme 
has two main components: a Supervisor component that creates files, and a User component 
that runs these files and creates logfiles. The two components have five functions: three 
performed by the Supervisor component (preparing project, displaying the logfile content 
and analyzing it) and two performed by the User component (displaying source text and 
receiving textual input, and logging keystroke real-time data) (Jakobsen, 2006). 


III.2. Logged Data Analysis Programmes 

Another type of software used for studying the writing process comprises the computerized
programmes that analyze the logfiles generated. S-notation, Progression Analysis and
LS Graph are three examples of these programmes.


III.2.1. S-notation 

This software is derived from the manual notation of the videotaped handwritten revision 
developed by Matsuhashi (1987) in which she transcribed the successive changes made by 
her subject to her texts and the place and order of these changes. It simplifies the analysis of 
online-revision by automatically identifying connected episodes of revision in readable 
notation, displaying the successive changes within a text. The resulting representation helps 
researchers to overcome problems encountered when analyzing raw keystroke logfiles and 
to analyze the changes made by writers both qualitatively and quantitatively (Kollberg &
Severinson-Eklundh, 2001; Severinson-Eklundh & Kollberg, 2003). In order to create an
S-notation file, the logfile is transformed to a Move-Insert-Delete (MID) file, independent of
the word processor used, which includes a list of elementary keystroke operations (moves, 
insertions and deletions) in the order writers make them. Trace-it, general-purpose software
developed for investigating writing strategies, presents the MID file generated by
S-notation in two windows: one for the revision record generated by S-notation and the
other for the final text produced. Trace-it has a replay feature which helps researchers to 
manually code the revisions and it can identify three types of revisions: repetitive revisions 
at one cursor location, embedded revisions and sequence of revisions in a previously written 
text (Severinson-Eklundh & Kollberg, 2003). S-notation can also identify the revisions 
generated by JEdit which functions as a logging word processor and runs on a Macintosh 
machine. 
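
The idea of a list of elementary operations can be made concrete with a small Python sketch that replays moves, insertions and deletions in the order they were made and rebuilds the final text. The tuple encoding of the operations and the function name `replay` are assumptions made for illustration; the actual MID file format is not described in this article.

```python
from typing import List, Tuple

# Hypothetical operation encoding:
#   ("move", position)   place the cursor at an absolute character position
#   ("insert", text)     insert text at the cursor and advance past it
#   ("delete", count)    delete count characters to the left of the cursor
Operation = Tuple[str, object]


def replay(operations: List[Operation]) -> str:
    """Apply the operations in order and return the resulting text."""
    text = ""
    cursor = 0
    for name, arg in operations:
        if name == "move":
            # Clamp the cursor so that it stays inside the text.
            cursor = max(0, min(int(arg), len(text)))
        elif name == "insert":
            text = text[:cursor] + str(arg) + text[cursor:]
            cursor += len(str(arg))
        elif name == "delete":
            start = max(0, cursor - int(arg))
            text = text[:start] + text[cursor:]
            cursor = start
        else:
            raise ValueError(f"unknown operation: {name}")
    return text


ops = [
    ("insert", "The cat sat"),
    ("move", 7),            # cursor just after "cat"
    ("delete", 3),          # remove "cat"
    ("insert", "dog"),      # a revision made away from the end of the text
    ("move", 11),
    ("insert", " down."),
]
print(replay(ops))
```

Because such a list does not depend on the word processor that produced the keystrokes, the same replay logic could in principle serve logfiles from different logging programmes.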


III.2.2. Progression Analysis

Perrin (2001, 2003) has presented Progression Analysis which is derived from S-notation 
and provides the progression of the writing session in several episodes. It incorporates 
keystroke logging analysis with retrospective accounts to interpret the writing process. 
Progression Analysis is a multilevel method for investigating the writing process as it 
examines three levels of the writing process, i.e. its context (macro level), the development 
of the text or its progression (meso level) and the use of writing strategies (micro level). 


III.2.3. LS Graph

This programme represents the logged data graphically. One main feature of the graphical 
representation provided by LS Graph is the multi-layered information it presents, e.g.
revision analysis, manual analysis of logged data and verbal data analysis. LS Graph can be
used in both researching and teaching writing to identify the different patterns of writing
strategies used (Lindgren & Sullivan, 2002). Lindgren, Sullivan, Lindgren and Spelman
Miller (2007) have presented an expanded version of LS Graph, Geographical Information
Systems (GIS), which is used for visualizing spatial and temporal information about
cognitive activities and for creating information layers from large quantities of data. GIS
helps in identifying the time, the location and the way various writing cognitive activities 
occur and in understanding the intra- and inter-individual differences in writing processes. 

The following section discusses the different ways of analyzing the data generated 
by the logging software. 


IV. ANALYZING THE WRITING PROCESS LOGGED DATA 

The keystroke logged data are saved as logfiles which have records of pauses, deletions,
insertions, cursor movements, etc. The analysis and interpretation of the logged data
are not as easy as they might seem, owing to the lack of information about internal cognitive
processes (Lindgren, 2005). The different word processors used in the previous studies
increase the difficulties of comparing the taxonomies proposed by writing researchers. In 
addition, addressing the typographical corrections and the processor functions is another 
thorny issue in analyzing the logged data of the writing process. However, there are specific 
trends for analyzing the logged data of the writing process that can be identified in the 
previous studies. Generally speaking, taxonomies used in the previous computer-aided 
writing studies have focused on analyzing the logged data in three different ways: analyzing
the whole writing process, analyzing the revision component and analyzing the temporal
aspects of the writing process. Examples of the taxonomies belonging to these three
categories of analyzing the logged data are given below.


IV.1. Analyzing the Whole Writing Process

Some studies employing the keystroke logging method used taxonomies to analyze the
writing process as a whole. An example of these taxonomies is Bridwell-Bowles et al.’s
(1987) which includes the following types of writing behaviours: pausing, text production, 
editing, revision, cursor movement, scrolling and combinations of more than one operation. 
A different taxonomy that has been used for analyzing the ScreenRecorder data was 
developed by Owston, Murphy and Wideman (1992). Their taxonomy has four main 
categories which are based on keyboard actions; these categories are: a) text scanning and 
cursor movements (backward cursor, forward cursor, page up, page down, beginning point, 
endpoint and block highlight); b) menu-bar icon use (Apple, File, Edit, Windows, Search, 
Format, Spelling and Macros); c) text deletion (Backspace and Block); and d) text addition 
(Insert and Text Entry). Another taxonomy integrating a component for using computer 
referencing features was developed by Scott and New (1994). This taxonomy includes two 
categories: a) strategies considered to be effective for foreign language writing (adherence to 
guidelines, example inquiry, conjugation inquiry, French dictionary inquiry, circumlocution, 
browsing, error avoidance, a recursive approach and final revising); and b) other FL writing 
aspects evaluated (English dictionary dependence, quality of lexis and the time spent on the 
task in relation to the number of lines produced). The strategies in the first category were given
a frequency score from 1 (Never) to 5 (Very often).

Combining the keystroke recorded data with verbal protocols, Levy and Ransdell 
(1994) developed a coding scheme of the writing process which has two main categories, 
one for the written protocols and the other for the verbal protocols. The written protocol 
category includes typing, deleting, superficial errors, meaningful changes, pausing at various 
levels of the text produced, any movement in text, and new paragraph. The verbal protocol 
category encompasses pausing in speech, writing content, speaking and writing the same
content, planning future topic content and rereading text written. In a later study, Levy and 
Ransdell (1995) developed their combinational response patterns for determining the writing 
subprocesses. These patterns include responses from writing protocol scored from visual 
track, pausing or starting a new paragraph, typing, deleting, making meaningful or
non-meaningful changes and any cursor movement.

Recently, Leijten and Van Waes (2005) developed categorization and transcription 
models for analyzing the composing process of their subjects. The categorization model is 
used for describing the different aspects of the writing process, i.e. writing modes, technical 
problems, revisions and pauses. The S-notation-based transcription model, on the other 
hand, provides a multi-layered linear representation of the writing process. 


IV.2. Analyzing Online Revision 

A main focus of the real-time computer-aided studies is analyzing the revisions made by 
writers while composing using the computer. Some real-time computer revision studies (e.g.
Lindgren & Sullivan, 2003) have used taxonomies developed for handwritten revision.
However, the majority of these studies have analyzed the logged data of revision in terms of 
computer-related categories. The following three taxonomies are examples of how real-time 
revisions were analyzed. 

New (1999) has developed a Systeme-D-based taxonomy of revision which includes
four categories, three of which are related to the type and level of changes made to the text
(formal changes, meaning changes and length of changes) and one of which is related to the use of
Systeme-D functions or inquiries. The formal changes category includes: spelling, tense,
number, modality and word form, abbreviation, contraction, punctuation, capitalization, 
paragraph format, other format such as spacing and indent, typographical corrections, 
grammar and no change. The meaning changes category encompasses: addition, deletion, 
substitution, permutation, distribution and consolidation. The length of changes category has 
the following types: graphical change, lexical change, phrasal change, clausal change, 
sentence change and multi-sentence change. The last category, Systeme-D functions, 
includes: both English and French dictionary inquiries (example and conjugate, note and
scroll), index (vocabulary and phrase) and save. Van Waes and Schellens (2003)
have analyzed online revisions in terms of seven categories. These are: a) number of 
revisions; b) type of revision (addition, deletion, substitution and reordering); c) level of 
revision (letter, word, phrase, sentence, paragraph, layout and punctuation); d) purpose of 
revision (correction of typing errors, revision of form and revision of content); e) location of 
revision (title, first paragraph, first sentence of paragraph and elsewhere); f) remoteness of 
revision as measured in terms of the number of lines above or below the point of inscription;
and g) temporal location of revision (stage, segment and unit). A recent keystroke
data-based revision taxonomy was used by Stevenson, Schoonen and de Glopper (2006). The
four dimensions of their multi-dimensional revision taxonomy are: a) orientation (content, 
language and typing); b) domain (clause and above, below-clause and below-word); c) 
location (pre-text, point of inscription and previous text); and d) action (addition, deletion, 
substitution and other). 


IV.3. Analyzing the Temporal Aspects of the Writing Process 

Real-time computer-aided writing is a rich area for examining and analyzing the temporal 
aspects of the writing process because the logged data provide an accurate account of the 
timing of each key press, cursor movement and pause. A main focus of temporal analysis of 
the logged data is the location of pauses. Levy and Ransdell (1994) analyzed pause location 
in terms of pausing within a word, pausing within a clause or a sentence, pausing within a 
paragraph and general pausing between paragraphs. Similarly, Van Waes and Schellens 
(2003) identified pauses within the sentence, at sentence boundaries, and at paragraph 
boundaries. On the other hand, Spelman Miller (2000) defined pause location based on
potential completion points at a number of levels: character, word, intermediate constituent,
clause and sentence.
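
A rough Python sketch can show how pause locations of this kind might be assigned automatically from the text typed before a pause. The category labels and the rules are simplifying assumptions made for illustration: boundaries are recognised from trailing characters only, so an abbreviation ending in a full stop would be misread as a sentence boundary, and clause-level or constituent-level locations, which require linguistic analysis, are not attempted.

```python
def pause_location(text_before: str) -> str:
    """Classify where a pause falls from the text typed before it.

    Simplifying assumption: the pause occurs at the end of text_before,
    and boundaries are recognised from trailing characters only.
    """
    if not text_before:
        return "start of text"
    if text_before.endswith("\n"):
        return "paragraph boundary"
    # Ignore trailing spaces when looking for sentence-final punctuation.
    stripped = text_before.rstrip(" ")
    if stripped.endswith((".", "!", "?")):
        return "sentence boundary"
    if text_before.endswith(" "):
        return "between words"
    return "within word"


for sample in ["The ca", "The cat ", "The cat sat.", "The cat sat.\n"]:
    print(repr(sample), "->", pause_location(sample))
```

The function named `pause_location` is hypothetical; the taxonomies cited above were applied by the researchers themselves and may resolve borderline cases differently.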

Van Waes and Schellens’s (2003) more comprehensive taxonomy of the writing
process temporal aspects includes the following categories: a) ratio of time spent pausing to 
time spent actively writing; b) number of words in the final text; c) duration of pauses; d) 
number of pauses; e) type of pause (formulation pauses and revision pauses); f) linguistic 
location of pauses (character, word, clause and sentence); and g) temporal location of 
pauses. The two authors constructed their subjects’ writing profiles based on analyzing the
variables related to three aspects of the writing process: time spent on writing the final 
product, total duration of the writing process, and duration of each stage of the process 
(stage 1 from starting to completing the first draft and stage 2 from completing the first draft 
to completing the final version). Combining pause with edits, Epting’s (2004) taxonomy has 
these categories: a) question reviews (number of reviews and total review time); b) pause 
measures (pre-response time, post-response time, average pause length, total number of 
pauses and total number of non-stop pauses); c) pause associated edits (PAEs) (deletions, 
substitutions, insertions, backspace and total PAEs); d) edits without pause (EWPs) 
(deletions, substitutions, insertions, backspace and total EWPs); and e) total edit indices 
(total PAEs and EWPs, and keystroke/released characters). Hayes and Chenoweth (2006) 
analyzed some other temporal and logged data aspects; these include: a) transcription rate 
(words typed per minute); b) error rate (uncorrected errors per 100 words); c) correction rate 
(corrected errors per 100 words); d) wasted keystrokes (percentage of total keystrokes 
devoted to deleting errors and to typing the errors that were corrected); and e) number of 
bursts in each trial (a burst is a period of continuous typing followed by a spontaneous pause 
of 2 or more seconds in which no typing occurred). 
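
Several of the measures listed above can be computed directly from a time-stamped keystroke log. The Python sketch below shows one way this might be done for transcription rate, wasted keystrokes and number of bursts. It is a sketch under stated assumptions rather than the procedure of any study cited here: the log format (a list of `(time_in_seconds, key)` pairs), the `BACKSPACE` marker and the function name `summarize_log` are hypothetical, every run of typing delimited by pauses of 2 or more seconds is counted as a burst (including the final run), and error and correction rates are omitted because they require manual identification of errors.

```python
BACKSPACE = "\b"  # assumed marker for a backspace key press


def summarize_log(log, pause_threshold=2.0):
    """Compute simple temporal measures from a list of (time, key) pairs."""
    if not log:
        return {"words_per_minute": 0.0, "wasted_pct": 0.0, "bursts": 0}

    text = []        # characters currently in the text
    backspaces = 0   # key presses spent deleting
    deleted = 0      # characters that were typed and later deleted
    bursts = 1       # the first run of typing opens the first burst

    for i, (time, key) in enumerate(log):
        # A gap of pause_threshold seconds or more starts a new burst.
        if i > 0 and time - log[i - 1][0] >= pause_threshold:
            bursts += 1
        if key == BACKSPACE:
            backspaces += 1
            if text:
                text.pop()
                deleted += 1
        else:
            text.append(key)

    minutes = (log[-1][0] - log[0][0]) / 60.0
    words = len("".join(text).split())
    return {
        # words in the final text per minute of session time
        "words_per_minute": words / minutes if minutes > 0 else 0.0,
        # share of keystrokes spent deleting or typing deleted characters
        "wasted_pct": 100.0 * (backspaces + deleted) / len(log),
        "bursts": bursts,
    }


if __name__ == "__main__":
    log = [
        (0.0, "t"), (0.5, "e"), (1.0, "h"), (1.5, BACKSPACE), (2.0, BACKSPACE),
        (2.5, "h"), (3.0, "e"), (3.5, " "), (6.0, "c"), (6.5, "a"),
        (7.0, "t"), (30.0, "."),
    ]
    print(summarize_log(log))
```

The ratio of pausing time to active writing time can be obtained in the same way by summing the gaps that meet the pause threshold and dividing by the remaining session time.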

In the following section, the author briefly reviews the previous studies employing keystroke 
logging in investigating the writing process. 


V. COMPUTER-AIDED WRITING PROCESS RESEARCH: AN OVERVIEW 

The previous studies using keystroke logging in investigating the writing process can be 
classified into five categories: a) studies on revision; b) studies on the temporal aspects of 
the writing process; c) studies on using the logged data to stimulate writers’ retrospection; d) 
studies on the writing process as a whole; and e) studies on the other aspects of the writing 
process. 


V.1. Studies on Revision 

A large number of studies have used keystroke logging to investigate revision. Some of 
these studies were conducted in the L1 context, including Bonk and Reynolds’s (1990) 
intervention study on the revisions made by students in five tasks, Epting’s (2004) 
investigation of the revising strategies of college students, and Severinson-Eklundh and 
Kollberg’s (2003) study of revisions made by Swedish university students in their L1. The 
three studies revealed that expert modeling of writing prompts lowered the number of 
changes students made during the writing process (Bonk & Reynolds, 1990), that the 
computer-based writing task had a major influence on revisions resulting from 
problem-solving processes (Severinson-Eklundh & Kollberg, 2003), and that writing and editing 
behaviors varied depending on pre-response time, i.e. the amount of time spent before 
beginning writing (Epting, 2004). The two studies conducted by New (1999) and Kim 
(2002) used keystroke logging to examine revision in the FL and L2 contexts, 
respectively. Using Systeme-D, New (1999) found that advanced intermediate level learners 
of French (FL) were mainly concerned with making changes for form rather than for content 
in computer-aided writing and that their linguistic concerns and lack of explicit instruction 
in revision and computer strategies hindered the reviewing and reworking of their texts. Kim 
(2002) made use of Track Changes functions in Microsoft Word to identify the textual 
changes made by her English-as-a-second-language (ESL) subjects and found that 
intermediate writers made more mechanics changes than advanced writers and that both 
groups made the largest number of changes in grammar. 

Other researchers have used keystroke logging to examine revising in both L1 and 
L2/FL. Using Trace-it to compare the online revisions made by undergraduate writers in L1 
(English) and FL (German), Thorson (2000) found that writers made more immediate and 
distant revisions in German than in English. The two studies of Lindgren (2005) and 
Lindgren and Sullivan (2006) used keystroke logging to examine revising in L1 (Swedish) 
and English-as-a-foreign-language (EFL) writing. Both studies found that writers undertook 
more pre-contextual revision of both form and concepts in their English texts than in their 
Swedish texts. The latter study also indicated that writers revised more at the point of 
inscription in FL than in L1. Stevenson et al. (2006) used Trace-it to compare online 
revisions made by 22 Dutch junior high school writers composing four argumentative 
essays, two in Dutch and two in English (FL). Their results indicated that writers made 
revisions at the linguistic level more frequently in FL than in L1 and that little relationship 
was found between revision frequencies and text quality. 

Some researchers used keystroke logging to compare how students revise in the 
computer-based mode and in the handwriting mode. Collier (1983), and Collier and Werier 
(1995) video-recorded the computer screen activities of 3 professional writers to compare 
their computer revisions with their pen and paper ones. While the results of the first study 
showed that revision using a word processor is more complex than pen and paper revision, 
data obtained from the second study suggest that both writing modes are ‘equally effective 
in creating texts’. Similar to Collier’s (1983) results, Lutz (1987) found that her subjects, 
professional writers and experienced PhD student writers, spent more time writing, produced 
less text and made more changes on the computer task than on the pen and paper task. Lam 
(1992) used computer stroke records and text analysis to analyze ESL student writers’ 
revision in relation to task mode (computer-based vs. pen and paper mode) and task type 
(expressive, persuasive and transactional). Her study showed that writers made more 
changes and revised more recursively in their computer drafts than in pen and paper ones, 
but did not revise differently in the three task types. Using ScreenRecorder, Owston et al. 
(1992) compared the effect of computer-based writing and pen and paper writing on text 
quality and revision strategies. Contrary to the findings of many studies reviewed here, 
their study found that writing on a computer led neither to more revisions nor to 
a text of better quality than writing using pen and paper. 


V.2. Studies on the Temporal Aspects of the Writing Process 

Many studies have made use of the logged data in exploring the temporal aspects of the 
writing process such as writers’ pausing and production rate. Using Keytrap, Van Waes 
(1991) compared the effect of writing in three different modes (pen and paper, computer 
with a 25-line screen, and computer with a 66-line screen) on the time spent writing, text 
length, pause behaviour and revisions made. Van Waes’s study revealed that compared to 
pen and paper writers, computer writers spent more time writing the first draft, produced 
longer texts, paused more at the beginning of sentences, spent less time on pauses 
between sentences and paragraphs, and revised more at the letter level and less at the word level. 
But the total number of revisions, total writing and pausing time were comparable in 
computer and pen and paper writing modes. Replicating the same study, Van Waes and 
Schellens (2003) reached similar results. 

The effort and time allocated to writing subprocesses were examined by Levy and 
Ransdell (1994, 1995, 1996) through videotaping of the computer screen and EventLog for 
analyzing the data, and by Piolat, Kellogg and Farioli (2001) using SCRIPTKELL. Levy and 
Ransdell’s three studies indicated that writers devoted more time to generating text and 
planning ideas than to reviewing and revising and that the time devoted to each of these four 
subprocesses varied at the different phases of the composing task. Similar results were 
reached by Piolat et al. whose study revealed that ‘planning and translating decreased across 
the first, second, and third phases of writing, whereas evaluation, revision, and execution 
increased slightly’. Examining the temporal differences between computer writing and pen 
and paper writing using reaction time tasks, Kellogg and Mueller’s (1993) two experiments 
revealed that the use of a word processor restructured the writing process in that writers 
allocated more working memory effort to planning and reviewing when using the computer 
than when writing longhand. In a naturalistic case study employing the videotaping of the 
computer monitor, Ballard (1994) looked at the pausing behaviour of a professional ESL 
writer while composing different tasks using a word processor. Ballard’s study indicated that 
pause length was longer prior to T-units than within T-units and that fluent writing, 
measured in terms of pause length, had few pauses greater than 10 seconds. Janssen et al. 
(1996) used Keytrap to examine writers’ pausing while they think aloud. The logged data 
provided them with evidence that the think-aloud method does influence writers’ pausing 
due to its reactivity. 

Spelman Miller (2000, 2006) looked at the pause-related phenomena (pause duration, 
pause frequency, pause rate and pause location) and rate of production (within text span) of 
10 L1 (English) and 11 ESL university student writers of English through the micro analysis 
of the keystroke logging data. Results showed that ESL writers tended to pause longer than 
native-English writers at all locations, particularly at clause and sentence completion points 
and that they had a lower rate of production. Rate of production has also been addressed in a 
recent study by Hayes and Chenoweth (2006) who used articulatory suppression, a 
technique that reduces working memory, to test the hypothesis that verbal working memory 
is not involved in typing and editing the text. The logged data they collected did not confirm 
the tested hypothesis as writers in the articulatory suppression condition typed their texts 
significantly more slowly and made significantly more errors than they did in the control 
condition. In their attempt to develop a new index for measuring writing fluency, Matsuno, 
Sakaue, Morita and Sugiura (2007) used keystroke logging to observe L1 and EFL writers’ 
production rate. Based on their findings that the Japanese writers of English had heavier 
production processing efforts or loads and were less fluent than native writers of English, 
they conclude that the reduction value of the processing loads can be an effective index for 
measuring writing fluency. 


V.3. Studies on Using the Logged Data to Stimulate Writers’ Retrospection 

Another area of the computer-aided writing process research is using the logged data to 
stimulate writers’ retrospective accounts of their composing strategies and to raise their 
awareness of them. Two early attempts of that kind were made by Flinn (1987a) who used 
the data logged by COMPTRACE to stimulate 2 sixth graders’ retrospective thoughts about 
their revising strategies, and Sire (1988) who used the real-time playbacks of the computer 
keystroke-recorded writing sessions to stimulate his subjects’ retrospective accounts of their 
composing process. These two studies revealed that replaying the logged data was a very 
helpful tool in stimulating subjects’ retrospective thoughts about their composing decisions. 
Likewise, Ransdell (1995) found that using the real-time replay of the keystrokes, generated 
by special memory-resident software during word processing, to stimulate her subjects’ 
retrospective accounts while watching their letters display is less intrusive than having them 
think aloud while composing. Replaying the data logged by Trace-it in stimulated recall 
sessions, Sullivan and Lindgren (2002) found that these sessions helped the subjects become 
aware of their writing behaviours and approach the writing tasks differently. Similarly, 
Lindgren and Sullivan’s (2003) study indicated that stimulated recall, when used together 
with keystroke logging, raised students’ awareness about revising and increased their 
non-surface revisions. 


V.4. Studies on the Writing Process as a Whole 

A few studies employed computer-aided methods to investigate the writing process 
as a whole without focusing on a specific component or aspect of it. These include the 
studies conducted by Bridwell-Bowles et al. (1985, 1987), Balkema (1985) and Bisaillon 
(1997) who used a video-recorded computer-based task to examine the composing process 
of four advanced university learners of French as an L2. These four studies revealed that 
writing expertise played an important role in employing the handwritten-task composing 
behaviours on the computer-based tasks (Bridwell-Bowles et al., 1985), and that computer 
writers made more superficial changes than meaningful changes to the text (Balkema, 1985), 
and spent more time on pausing than on other subprocesses (planning, text production and 
revising) (Bridwell-Bowles et al., 1987) and on correcting the written text, at the word level 
in particular, than on generating ideas (Bisaillon, 1997). 


V.5. Studies on the Other Aspects of the Writing Process 

The logged data has also been used to explore various other aspects of the writing process. 
Webb (1992) investigated undergraduates' reading during composing by recording their 
transcription and cursor movements through which he inferred their reading behaviours 
based on the implied vision of the text. His study revealed that writers who reread their texts 
more did not necessarily rewrite more and that some writers’ decisions about content and 
structure occurred prior to transcribing the text while others organized their texts while 
transcribing it. Using JEDIT and S-notation to investigate the non-linearity of the computer- 
based text, Severinson-Eklundh (1994) confirmed the hypothesis that writers composing on 
the computer developed their texts in a non-linear way, i.e. in a different order from that of 
the final presentation of the text. 

Other aspects examined in the keystroke logging research of the writing process 
include students’ use of integrated graphic media while composing (Johnson, 1992) and 
their use of spelling checker (Figueredo, 2006). Johnson’s study showed that the writers who 
used graphic production tools to incorporate pictures or drawings into their reports made 
broader and more extended use of other program components than writers who did not make 
use of these tools. Figueredo’s study, on the other hand, revealed that child participants 
whose typing skill accounted for differences in the length of their stories used the spell 
checker most often to correct misspellings and that they were mostly successful at choosing 
the correct words on the spell checker's list, regardless of their position on the list. 

The logged data has also been used to examine the influence of some explanatory 
variables on the writing process such as word-processing-based teaching vs. 
non-word-processing teaching of writing (Benesch, 1987; Radziemski, 1995) and electronic writing 
proficiency (Youngquist, 2003). Benesch found that her 3 ESL college students made 
extensive revisions with pen and paper but they did not use computer time to revise, rather 
they used it for generating ideas, editing, and gaining familiarity with the technology. 
Radziemski’s study showed a positive effect for word processing instruction on improving 
text quality. Youngquist’s (2003) study indicated that neither word processing skills nor 
perceptions determined the quality of essays, but instruction that integrated technology to 
support the pedagogy of the writing classroom resulted in enhancing the quality of students’ 
work. 

Some other published works on the computer-aided study of the writing process (e.g. 
Flinn, 1987b; Jakobsen, 2006; Scott & New, 1994; Smith, Lansman & Weber, 1990) were 
mainly concerned with presenting some software and explaining the advantages it offers. 


VI. CONCLUSION 

This article has reviewed the software used for recording and analyzing the real-time data of 
the writing process, and the different trends of writing process research employing the 
keystroke logging method. Compared to other related methods, keystroke logging is an 
unobtrusive, effective and simple way of collecting data about the composing process. In 
addition, the data offered by this method about writers’ revision and the temporal aspects of 
their processes in particular are unobtainable by other methods. The logged data have 
deepened our understanding of these two important dimensions of the writing process and 
enabled writing researchers to have a clearer and more accurate picture about them. Results 
on using the logged data to stimulate students’ retrospective accounts of their writing 
process are also promising in that they have shown the effectiveness of this type of data as a 
secondary research method and as a teaching tool that can inform teachers about their 
students’ writing process and computer writing proficiency and help students become aware 
of their own writing strategies. Given that computers are increasingly used for performing 
different writing tasks, we might expect that keystroke logging will be the most commonly 
used data source in writing process research in the years to come. 

As the review has shown, however, analyzing and interpreting the logged data is not 
an easy process. The area of analyzing the logged data is still lacking truly valid taxonomies 
for analyzing this kind of data; these analysis categorizations need to be consistent with 
those developed for analyzing writing process introspective and retrospective data. An 
optimal way of making the most use of keystroke logging data is to combine it with other 
data sources, such as verbal reports, that can help us identify writers’ internal processes. 
More research is needed on how combining keystroke logging with different data sources 
can enhance our understanding of the writing process. This approach has been adopted by 
some of the above reviewed studies, e.g. Lindgren (2005) and Stevenson et al. (2006). 
Recently, keystroke logging data has been combined with eye-tracking, a technique that can 
provide important information about what writers do while composing particularly in their 
monitoring and revision operations and about the interaction between perception and 
production while composing (Stromqvist et al., 2006: 62, 70). 

As the above review of the previous writing process studies employing keystroke 
logging shows, the majority of these studies have investigated either writers’ revisions or the 
temporal aspects of their composing. More research is needed on the writing process aspects 
that have received little attention in the previous studies such as reviewing strategies and 
using the referencing features of word processors. In addition, the findings reached by the 
previous studies reviewed about writers’ revisions or the temporal aspects of their 
composing process need to be documented by further research employing keystroke logging. 


© Servicio de Publicaciones. Universidad de Murcia. All rights reserved. 


IJES, vol. 8 (1), 2008, pp. 29-50 





It can also be noted that only thirteen out of the forty-six studies reviewed in the above 
section have used keystroke logging in exploring L2/FL composing, and that none of the 
studies reviewed has used this method in investigating the writing process in some 
languages such as Chinese, Spanish or Arabic. Accordingly, more writing process research 
making use of the logged data is needed in the L2/FL contexts and in these unexplored 
cultural backgrounds. Finally, more research is needed to further document the pedagogical 
implications of the keystroke logging method. 


NOTES 

1 Writing this article has mainly depended on my PhD research-related readings. I would 
like to acknowledge the support given to my research by the International Research 
Foundation for English Language Education (TIRF), and thank its Board of Trustees for 
granting me the 2008 Sheikh Nahayan Doctoral Dissertation Fellowship award. 